Declarative Web data extraction and annotation

Authors

  • Carlo Bernardoni
  • Giacomo Fiumara
  • Massimo Marchi
  • Alessandro Provetti
Abstract

We propose a software architecture for semantics-based annotation of data extracted from Web sources. Starting from the LiXto suite, which enables semi-automated extraction of XML data from regular documents, we present a solution for attaching background information to individual tags by means of so-called decorations. Decoration is carried out as an inferential activity in the formal context of Answer Set Programming. We discuss a motivating example that serves to validate our approach.
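
To make the idea of decorating tags concrete, here is a minimal Python sketch. It is not the authors' architecture and it does not use LiXto or an Answer Set Programming solver: the XML sample, the `RULES` table, the `decorate` function, and the `decoration` attribute are all hypothetical, and plain rule matching stands in for the inference step.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML, standing in for the output of an extraction wrapper.
EXTRACTED = (
    "<offers>"
    "<offer><title>Laptop</title><price>950</price></offer>"
    "<offer><title>Mouse</title><price>20</price></offer>"
    "</offers>"
)

# Hypothetical background knowledge: (tag name, condition on the tag's
# text, decoration label). A real system would derive labels by inference
# over a logic program; simple rule matching is used here instead.
RULES = [
    ("price", lambda text: float(text) > 500, "expensive"),
    ("price", lambda text: float(text) <= 500, "affordable"),
    ("title", lambda text: True, "product-name"),
]


def decorate(xml_text):
    """Parse the XML and attach matching labels to each element
    as a space-separated 'decoration' attribute."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        text = (element.text or "").strip()
        # The condition is only evaluated when the tag name matches.
        labels = [
            label
            for tag, holds, label in RULES
            if element.tag == tag and holds(text)
        ]
        if labels:
            element.set("decoration", " ".join(labels))
    return root


if __name__ == "__main__":
    decorated = decorate(EXTRACTED)
    print(ET.tostring(decorated, encoding="unicode"))
```

Keeping the decoration in an attribute leaves the extracted content untouched, so the annotated document can still be consumed by tools that expect the original XML structure.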

Similar resources

OWL-AA: Enriching OWL with Instance Recognition Semantics for Automated Semantic Annotation

Although OWL provides a solid basis for many semantic web applications, it lacks sufficient declarative semantics for instance recognition to support automated semantic annotation. This omission prevents OWL from being a satisfactory ontology language for automated semantic annotation. This problem can be solved by adding declarative instance recognition semantics to OWL. Our declarative instan...

Web-Style Multimedia Annotations

Annotation of multimedia resources supports a wide range of applications, from associating metadata with multimedia resources or parts of these resources, to the collaborative use of multimedia resources through distributed authoring and annotation. Most annotation frameworks, however, are based on a closed approach, where the annotation data is limited to the a...

Data extraction and annotation based on domain-specific ontology evolution for deep web

The deep web responds to a user query with result records encoded in HTML files. Data extraction and data annotation, which are important for many applications, extract and annotate the records from the HTML pages. We propose a domain-specific ontology-based data extraction and annotation technique; we first construct a mini-ontology for a specific domain according to information of query interface and qu...

Information Extraction from Unstructured and Ungrammatical Data Sources for Semantic Annotation

The internet has become an attractive avenue for global e-business, e-learning, knowledge sharing, etc. Due to continuous increase in the volume of web content, it is not practically possible for a user to extract information by browsing and integrating data from a huge amount of web sources retrieved by the existing search engines. The semantic web technology enables advancement in information...

Annotation for Query Result Records based on Domain-Specific Ontology

The World Wide Web is enriched with a large collection of data, scattered across deep web databases and web pages in unstructured or semi-structured formats. Recently evolving customer-friendly web applications need special data extraction mechanisms to draw the required data out of the deep web according to the end-user query and to populate the output page dynamically as quickly as possible. In...

Publication date: 2006